{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "635d8ebb",
   "metadata": {},
   "source": [
    "# DatetimeOutputParser\n",
    "\n",
    "- Author: [Donghak Lee](https://github.com/stsr1284)\n",
    "- Peer Review : [JaeHo Kim](https://github.com/Jae-hoya), [ranian963](https://github.com/ranian963)\n",
    "- Proofread : [Two-Jay](https://github.com/Two-Jay)\n",
    "- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n",
    "\n",
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/03-OutputParser/06-DatetimeOutputParser.ipynb)[![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/03-OutputParser/06-DatetimeOutputParser.ipynb)\n",
    "## Overview\n",
    "\n",
    "The ```DatetimeOutputParser``` is an output parser that generates structured outputs in the form of ```datetime``` objects.\n",
    "\n",
    "By converting the outputs of LLMs into ```datetime``` objects, it enables more systematic and consistent processing of date and time data, making it useful for data processing and analysis.\n",
    "\n",
    "This tutorial demonstrates how to use the ```DatetimeOutputParser``` to:\n",
    "1. Set up and initialize the parser for ```datetime``` generation\n",
    "2. Convert a ```datetime``` object to a string\n",
    "\n",
    "### Table of Contents\n",
    "\n",
    "- [Overview](#overview)\n",
    "- [Environment Setup](#environment-setup)\n",
    "- [Using the DatetimeOutputParser](#using-the-datetimeoutputparser)\n",
    "- [Using DatetimeOutputParser in astream](#using-datetimeoutputparser-in-astream)\n",
    "\n",
    "\n",
    "### References\n",
    "\n",
    "- [LangChain DatetimeOutputParser](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.datetime.DatetimeOutputParser.html)\n",
    "- [LangChain ChatOpenAI API reference](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html)\n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c6c7aba4",
   "metadata": {},
   "source": [
    "## Environment Setup\n",
    "\n",
    "Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.\n",
    "\n",
    "**[Note]**\n",
    "- ```langchain-opentutorial``` is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials. \n",
    "- You can checkout the [```langchain-opentutorial```](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "21943adb",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture --no-stderr\n",
    "%pip install langchain-opentutorial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "f25ec196",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install required packages\n",
    "from langchain_opentutorial import package\n",
    "\n",
    "package.install(\n",
    "    [\n",
    "        \"langchain\",\n",
    "        \"langchain_core\",\n",
    "        \"langchain_openai\",\n",
    "    ],\n",
    "    verbose=False,\n",
    "    upgrade=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "7f9065ea",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Environment variables have been set successfully.\n"
     ]
    }
   ],
   "source": [
    "# Set environment variables\n",
    "from langchain_opentutorial import set_env\n",
    "\n",
    "set_env(\n",
    "    {\n",
    "        \"OPENAI_API_KEY\": \"\",\n",
    "        \"LANGCHAIN_API_KEY\": \"\",\n",
    "        \"LANGCHAIN_TRACING_V2\": \"true\",\n",
    "        \"LANGCHAIN_ENDPOINT\": \"https://api.smith.langchain.com\",\n",
    "        \"LANGCHAIN_PROJECT\": \"06-DatetimeOutputParser\",\n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "690a9ae0",
   "metadata": {},
   "source": [
    "You can alternatively set ```OPENAI_API_KEY``` in ```.env``` file and load it.\n",
    "\n",
    "[Note] This is not necessary if you've already set ```OPENAI_API_KEY``` in previous steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "4f99b5b6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from dotenv import load_dotenv\n",
    "\n",
    "load_dotenv(override=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9760b5f",
   "metadata": {},
   "source": [
    "## Using the ```DatetimeOutputParser```\n",
    "If you need to generate output in the form of a date or time, the ```DatetimeOutputParser``` from LangChain simplifies the process.\n",
    "\n",
    "The **format of the ```DatetimeOutputParser```** can be specified by referring to the table below.\n",
    "| Format Code | Description           | Example              |\n",
    "|--------------|-----------------------|----------------------|\n",
    "| %Y           | 4-digit year          | 2024                 |\n",
    "| %y           | 2-digit year          | 24                   |\n",
    "| %m           | 2-digit month         | 07                   |\n",
    "| %d           | 2-digit day           | 04                   |\n",
    "| %H           | 24-hour format hour   | 14                   |\n",
    "| %I           | 12-hour format hour   | 02                   |\n",
    "| %p           | AM or PM              | PM                   |\n",
    "| %M           | 2-digit minute        | 45                   |\n",
    "| %S           | 2-digit second        | 08                   |\n",
    "| %f           | Microsecond (6 digits)| 000123               |\n",
    "| %z           | UTC offset            | +0900                |\n",
    "| %Z           | Timezone name         | KST                  |\n",
    "| %a           | Abbreviated weekday   | Thu                  |\n",
    "| %A           | Full weekday name     | Thursday             |\n",
    "| %b           | Abbreviated month     | Jul                  |\n",
    "| %B           | Full month name       | July                 |\n",
    "| %c           | Full date and time    | Thu Jul 4 14:45:08 2024 |\n",
    "| %x           | Full date             | 07/04/24             |\n",
    "| %X           | Full time             | 14:45:08             |\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "69cb77da",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Write a datetime string that matches the following pattern: '%Y-%m-%d'.\n",
      "\n",
      "Examples: 0594-05-12, 0088-08-25, 0371-10-02\n",
      "\n",
      "Return ONLY this string, no other words!\n",
      "-----------------------------------------------\n",
      "\n",
      "input_variables=['question'] input_types={} partial_variables={'format_instructions': \"Write a datetime string that matches the following pattern: '%Y-%m-%d'.\\n\\nExamples: 0594-05-12, 0088-08-25, 0371-10-02\\n\\nReturn ONLY this string, no other words!\"} template='Answer the users question:\\n\\n#Format Instructions: \\n{format_instructions}\\n\\n#Question: \\n{question}\\n\\n#Answer:'\n"
     ]
    }
   ],
   "source": [
    "from langchain.output_parsers import DatetimeOutputParser\n",
    "from langchain_core.prompts import PromptTemplate\n",
    "\n",
    "# Initialize the output parser\n",
    "output_parser = DatetimeOutputParser()\n",
    "\n",
    "# Specify date format\n",
    "date_format = \"%Y-%m-%d\"\n",
    "output_parser.format = date_format\n",
    "\n",
    "# Get format instructions\n",
    "format_instructions = output_parser.get_format_instructions()\n",
    "\n",
    "# Create answer template for user questions\n",
    "template = \"\"\"Answer the users question:\\n\\n#Format Instructions: \\n{format_instructions}\\n\\n#Question: \\n{question}\\n\\n#Answer:\"\"\"\n",
    "\n",
    "# Create a prompt from the template\n",
    "prompt = PromptTemplate.from_template(\n",
    "    template,\n",
    "    partial_variables={\n",
    "        \"format_instructions\": format_instructions,\n",
    "    },  # Use parser's format instructions\n",
    ")\n",
    "\n",
    "print(format_instructions)\n",
    "print(\"-----------------------------------------------\\n\")\n",
    "print(prompt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "4fc39071",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1998-09-04 00:00:00\n",
      "<class 'datetime.datetime'>\n"
     ]
    }
   ],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "model = ChatOpenAI(temperature=0, model_name=\"gpt-4o-mini\")\n",
    "\n",
    "# Combine the prompt, chat model, and output parser into a chain\n",
    "chain = prompt | model | output_parser\n",
    "\n",
    "# Call the chain to get an answer to the question\n",
    "output = chain.invoke({\"question\": \"The year Google was founded\"})\n",
    "\n",
    "print(output)\n",
    "print(type(output))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "6cf66464",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1998-09-04'"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert the result to a string\n",
    "output.strftime(date_format)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b0540cc",
   "metadata": {},
   "source": [
    "## Using ```DatetimeOutputParser``` in ```astream```\n",
    "Refer to the [user-defined generator](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/13-LangChain-Expression-Language/09-Generator.ipynb) to create a generator function.\n",
    "\n",
    "Let's create a simple example that converts ```astream``` output to ```datetime``` objects using a generator function.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "7080da4c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.output_parsers.string import StrOutputParser\n",
    "from langchain.output_parsers.datetime import DatetimeOutputParser\n",
    "from langchain_core.prompts.prompt import PromptTemplate\n",
    "from langchain_openai.chat_models.base import ChatOpenAI\n",
    "import datetime\n",
    "from typing import AsyncIterator, List\n",
    "\n",
    "# Initialize the output parser\n",
    "output_parser = DatetimeOutputParser()\n",
    "\n",
    "# Specify date format\n",
    "date_format = \"%Y-%m-%d\"\n",
    "output_parser.format = date_format\n",
    "\n",
    "# Get format instructions\n",
    "format_instructions = output_parser.get_format_instructions()\n",
    "\n",
    "# Create answer template for user questions\n",
    "template = (\n",
    "    \"Answer the users question:\\n\\n\"\n",
    "    \"#Format Instructions: \\n{format_instructions}\\n\"\n",
    "    \"Write a comma-separated list of 5 founding years of companies similar to: {company}\"\n",
    ")\n",
    "\n",
    "# Create a prompt from the template\n",
    "prompt = PromptTemplate.from_template(\n",
    "    template,\n",
    "    partial_variables={\"format_instructions\": format_instructions},\n",
    ")\n",
    "\n",
    "# Initialize the ChatOpenAI model with temperature set to 0.0\n",
    "model = ChatOpenAI(temperature=0.0, model_name=\"gpt-4o-mini\")\n",
    "\n",
    "# Create a chain combining the prompt, model, and string output parser\n",
    "str_chain = prompt | model | StrOutputParser()\n",
    "\n",
    "\n",
    "# Define an asynchronous function to convert strings to datetime objects\n",
    "async def convert_strings_to_datetime(\n",
    "    input: AsyncIterator[str],\n",
    ") -> AsyncIterator[List[datetime.datetime]]:\n",
    "    buffer = \"\"\n",
    "    async for chunk in input:\n",
    "        buffer += chunk\n",
    "        while \",\" in buffer:\n",
    "            comma_index = buffer.index(\",\")\n",
    "            date_str = buffer[:comma_index].strip()\n",
    "            date_obj = output_parser.parse(date_str)  # Convert to datetime object\n",
    "            yield [date_obj]\n",
    "            buffer = buffer[comma_index + 1 :]\n",
    "    date_str = buffer.strip()\n",
    "    if date_str:\n",
    "        date_obj = output_parser.parse(\n",
    "            date_str\n",
    "        )  # Convert remaining buffer to datetime object\n",
    "        yield [date_obj]\n",
    "\n",
    "\n",
    "# Connect the str_chain and convert_strings_to_datetime in a pipeline\n",
    "alist_chain = str_chain | convert_strings_to_datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "2434af42",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[datetime.datetime(1998, 9, 4, 0, 0)]\n",
      "[datetime.datetime(2004, 2, 4, 0, 0)]\n",
      "[datetime.datetime(2003, 2, 4, 0, 0)]\n",
      "[datetime.datetime(2001, 3, 1, 0, 0)]\n",
      "[datetime.datetime(1994, 3, 1, 0, 0)]\n"
     ]
    }
   ],
   "source": [
    "# Use async for loop to stream data.\n",
    "async for chunk in alist_chain.astream({\"company\": \"Google\"}):\n",
    "    # Print each chunk and flush the buffer.\n",
    "    print(chunk, flush=True)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain-opentutorial-k6AU65mE-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
